Thank you to R-Ladies Melbourne for the invitation to speak
R-Ladies community: welcoming and inclusive
background-image: url("https://media.giphy.com/media/f4FKTFwMXn1za/giphy.gif")
background-position: 50% 50%
class: center, inverse
Inspired and confident to dive in!
In the early 1980s, LaTeX was released.
LaTeX is a document preparation system.
Plain text + markup = defined structure (article, letter, bibliography)
Perfect for those who really care about typeface:
Kerning is the process of selectively adjusting the spacing between letter pairs to improve the overall appearance of text.
In the early 2000s, along came Sweave.
A function that enables integration of R code into LaTeX documents
The purpose is “to create dynamic reports, which can be updated automatically if data or analysis change”. [1]
.footnote[ [1] Leisch, Friedrich (2002). “Sweave, Part I: Mixing R and LaTeX: A short introduction to the Sweave file format and corresponding R functions” (PDF). R News. 2 (3): 28–31.]
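A minimal sketch of the Sweave idea, assuming a hypothetical file `report.Rnw` written to a temporary directory. It only uses `utils::Sweave()` from base R, which runs the R chunks and writes a `.tex` file; compiling that file to PDF would additionally need a LaTeX installation.

```r
# Write a tiny Sweave (.Rnw) file; the file name and contents are illustrative only.
rnw <- file.path(tempdir(), "report.Rnw")
writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<mean-example, echo=TRUE>>=",
  "x <- c(1, 2, 3, 4)",
  "mean(x)",
  "@",
  "The mean of x is \\Sexpr{mean(x)}.",
  "\\end{document}"
), rnw)

# Sweave() writes its output into the working directory,
# so switch to the temporary directory while it runs.
old_wd <- setwd(tempdir())
utils::Sweave(rnw, quiet = TRUE)
setwd(old_wd)

# The generated LaTeX now contains the code, its output, and the inline result.
tex_lines <- readLines(file.path(tempdir(), "report.tex"))
```

If the data in the chunk change, re-running `Sweave()` regenerates the report text, which is the "dynamic report" idea in the quote above.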
GNU Make:
1. Create a Makefile
2. Type make
See Karl Broman (https://kbroman.org/minimal_make/) for a fantastic tutorial.
knitr is Sweave reborn!
knitr can produce HTML, PDF, and Word output
R Markdown has a restricted set of commands, and there is no way to create custom commands; however, custom LaTeX can be included
A flavour of Markdown specifically for R.
R Markdown: render + knit to HTML
Three main sections:
* YAML header
* code chunks
* markdown text
YAML is human-friendly, cross language, plain text.
This block allows you to fine-tune the output of your document. YAML metadata allows for:
- TOC, tabbed sections, theme, highlight
- custom CSS
- evaluating R expressions, e.g. Sys.time()
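A minimal sketch of the three sections in one file, written from R so it can be run as-is. The file name, title, theme, and chunk label are all hypothetical choices; rendering is only attempted when the rmarkdown package and pandoc happen to be available.

```r
# Hypothetical file name; written to a temporary directory so nothing is overwritten.
rmd_path <- file.path(tempdir(), "example-report.Rmd")
fence <- strrep("`", 3)  # the three backticks that open and close a chunk

rmd_lines <- c(
  # 1. YAML header
  "---",
  "title: \"Example report\"",
  "date: \"`r Sys.time()`\"",  # R expression, evaluated when the document is rendered
  "output:",
  "  html_document:",
  "    toc: true",
  "    theme: flatly",
  "    highlight: tango",
  "---",
  "",
  # 2. markdown text
  "## Summary",
  "",
  "Some markdown text describing the analysis.",
  "",
  # 3. code chunk
  paste0(fence, "{r cars-summary, echo=FALSE}"),
  "summary(cars)",
  fence
)
writeLines(rmd_lines, rmd_path)

# Render to HTML only if the tooling is installed.
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  html_path <- rmarkdown::render(rmd_path, quiet = TRUE)
}
```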
three backticks{r chunk_name, options}
code!
three backticks
Common options:
* include = FALSE - runs the code but prevents code and results from appearing
* echo = FALSE - includes results (e.g. figures) but excludes the code
* message = FALSE - prevents messages generated by code from appearing
* warning = FALSE - as above, but for warnings
* fig.cap - adds captions to graphics
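The same options can also be set once for every chunk rather than repeated in each header. A small sketch, assuming the knitr package is installed; it would typically sit in a setup chunk at the top of the document.

```r
# Set document-wide defaults; individual chunks can still override them in their header.
if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::opts_chunk$set(echo = FALSE, message = FALSE, warning = FALSE)

  # Inspect the defaults that are now in effect.
  current_defaults <- knitr::opts_chunk$get(c("echo", "message", "warning"))
}
```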
Easy within RStudio:
File -> New File -> R Markdown
Chunks:
* Infrastructure - environment (e.g. libraries), loading data, defining analysis parameters
* Wrangling - code to transform data
* Communication - e.g. data visualization, summary tables
Do not hardcode paths! - use the here package
Do not hardcode values! - use parameters
Do not do everything in R Markdown (e.g. database queries)
Reduce duplication with functions
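A rough sketch of these tips together. The helper `summarise_column()`, the file `measurements.csv`, and the parameter name `column` are hypothetical; in a real parameterised report `params` is created by R Markdown from the YAML header, and a plain list stands in for it here so the code runs on its own.

```r
# Reduce duplication: one small function instead of copy-pasted summary code.
summarise_column <- function(data, column) {
  values <- data[[column]]
  c(mean = mean(values, na.rm = TRUE), sd = sd(values, na.rm = TRUE))
}

# Do not hardcode paths: build them relative to the project root with here::here().
# Falls back to a relative path if the here package is not installed.
data_path <- if (requireNamespace("here", quietly = TRUE)) {
  here::here("data", "measurements.csv")  # hypothetical file
} else {
  file.path("data", "measurements.csv")
}

# Do not hardcode values: read them from parameters.
# Stand-in for the `params` object of a parameterised R Markdown document.
params <- list(column = "mpg")

# The built-in mtcars data set is used so the example is self-contained.
result <- summarise_column(mtcars, params$column)
```

Changing `params$column` (or the value supplied at render time) reruns the same analysis on a different column without touching the code.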
“If you can type words, you can use bookdown”
@CivicAngela, RLadiesChicago
Helps scientists organise their research in a way that promotes:
- reproducibility
- collaboration/sharing of results
- effective project management
Combines literate programming and version control
Final result:
A website, containing time-stamped, versioned, and documented results
wflow_view()
# build the .html files from the .Rmd files
wflow_build()
wflow_view()
wflow_status()
wflow_publish(c("analysis/index.Rmd", "analysis/about.Rmd", "analysis/license.Rmd"),
              "Publish the initial files for MyProject")
wflow_status()
# create the GitHub repository MyProject
wflow_use_github("yourGitHub_username", "MyProject")
# dry run first; if the output looks ok, push for real
wflow_git_push(dry_run = TRUE)
wflow_git_push()
# do some stuff, then repeat the build, view, status, publish, push cycle
Once you are comfortable with the basics, go nuts with customisation.
e.g. https://github.com/timtrice/workflowr_skeleton
Sharing tidy, standardized, reproducible data sets for publications and collaborations can be challenging.
Read: https://ropensci.org/blog/2018/09/18/datapackager/
Caveats (e.g. small size of data): see https://github.com/ropensci/DataPackageR
(HT @_ColinFay)
Defining production:
“Software environments that are used and relied on by real users with real consequences if things go wrong” - @_ColinFay
“Production is anything that is run repeatedly and that the business relies on”
Log into the AWS Management Console:
Make use of multiple cores:
install.packages(c("doMC", "foreach"))
library(parallel)  # provides detectCores()
library(foreach)
library(doMC)
detectCores()  # number of cores available on the machine
doMC::registerDoMC(cores = detectCores())
foreach::getDoParWorkers()  # number of registered workers
https://blog.sicara.com/speedup-r-rstudio-parallel-cloud-performance-aws-96d25c1b13e2
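Once the workers are registered, a loop can be spread across them with `%dopar%`. A small sketch, assuming foreach and doMC are installed (doMC is not available on Windows); the function `square()` is a hypothetical stand-in for real work, and the code falls back to a plain sequential loop when the packages are missing.

```r
library(parallel)

# detectCores() can return NA on some systems, so fall back to a single core.
n_cores <- max(1L, detectCores(), na.rm = TRUE)

# Hypothetical stand-in for an expensive computation.
square <- function(i) i^2

if (requireNamespace("foreach", quietly = TRUE) &&
    requireNamespace("doMC", quietly = TRUE)) {
  library(foreach)
  doMC::registerDoMC(cores = n_cores)
  # Each iteration may run on a different core; results are combined in order.
  squares <- foreach(i = 1:8, .combine = c) %dopar% square(i)
} else {
  # Sequential fallback so the sketch still runs without the packages.
  squares <- vapply(1:8, square, numeric(1))
}
```

For work this small the parallel overhead outweighs any gain; the pattern pays off when each iteration takes seconds or more.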